Research on Multi-Label Propagation Clustering Method for Microblog Hot Topic Detection<sup>*</sup>

doi:10.16451/j.cnki.issn1003-6059.201501001


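Abstract: Hot topic detection in microblogs is a current research focus. Because traditional hot term extraction methods are poorly suited to microblog data, a term life value calculation model based on aging theory is proposed for hot term extraction, and a term co-occurrence network is built from the correlations between hot terms. Because traditional term clustering algorithms handle overlapping hot terms between topics poorly and are not time-efficient, the idea of multi-label propagation is introduced, and a multi-label propagation clustering algorithm with nearly linear time complexity (TCMLPA) is designed to cluster the hot terms in the co-occurrence network and obtain the set of hot topics. Experimental results show that the term life value model effectively filters noise and extracts hot terms, and that TCMLPA improves the accuracy and efficiency of hot topic detection while keeping the clustering results stable.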
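The abstract does not give the formula of the term life value model, so the following is only a minimal sketch of the general aging-theory idea, assuming that a term's life value decays by a constant factor in every time slot and grows in proportion to the term's frequency in that slot. The function names `life_values` and `hot_terms` and the parameters `decay`, `growth`, and `threshold` are hypothetical and are not taken from the paper.

```python
def life_values(slots, decay=0.5, growth=1.0):
    """Track a life value per term across consecutive time slots.

    slots: list of dicts mapping term -> frequency in that time slot.
    decay and growth are assumed parameters, not values from the paper.
    """
    life = {}
    for counts in slots:
        # Terms seen earlier decay even when absent from the current slot.
        for term in set(life) | set(counts):
            life[term] = decay * life.get(term, 0.0) + growth * counts.get(term, 0)
    return life


def hot_terms(life, threshold):
    """Return the terms whose life value reaches the threshold, sorted by name."""
    return sorted(term for term, value in life.items() if value >= threshold)


# Hypothetical frequencies over three time slots.
slots = [{"quake": 5, "lunch": 4}, {"quake": 8}, {"quake": 9}]
print(hot_terms(life_values(slots), threshold=5.0))
```

Under these assumptions a term that keeps recurring accumulates life value, while a term that appears once fades, which is the noise-filtering behaviour the abstract attributes to the model.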


Keywords: Microblog, Hot Topic Detection, Aging Theory, Hot Term Extraction, Multi-label Propagation

Abstract: With the rapid growth of microblog data, extracting hot topics from vast numbers of microblog posts has become a research hotspot. Traditional methods for hot term extraction are poorly suited to microblog data, so a life value calculation model based on aging theory is established to extract hot terms. A hot term co-occurrence network is then built from the correlations between hot terms. Because traditional clustering methods can hardly handle hot terms that overlap between topics and cannot process vast amounts of data efficiently, a term clustering method based on a multi-label propagation algorithm (TCMLPA), which has nearly linear time complexity, is proposed to detect hot topics in the hot term co-occurrence network. The experimental results show that the life value calculation model filters noise and extracts hot terms effectively, while TCMLPA keeps the clustering results stable and improves the accuracy and efficiency of hot topic detection.
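The abstract does not specify the steps of TCMLPA, so the following is not the authors' algorithm. It is a rough sketch of generic overlapping label propagation on a weighted term co-occurrence graph, in the spirit of COPRA-style methods: each term keeps a small set of labels with belonging coefficients, so one hot term can end up in more than one topic. The names `propagate_labels`, `topics_from_labels`, and `max_labels`, the synchronous update, the use of co-occurrence weights, and the deterministic tie-breaking are all assumptions made for illustration.

```python
from collections import defaultdict


def propagate_labels(graph, max_labels=2, iterations=20):
    """Overlapping label propagation on a weighted, symmetric graph.

    graph: dict node -> dict neighbour -> positive co-occurrence weight.
    Returns dict node -> dict label -> belonging coefficient (sums to 1).
    """
    labels = {node: {node: 1.0} for node in graph}
    for _ in range(iterations):
        updated = {}
        for node, neighbours in graph.items():
            if not neighbours:
                updated[node] = labels[node]  # isolated term keeps its label
                continue
            # Sum the neighbours' coefficients, weighted by co-occurrence.
            scores = defaultdict(float)
            for nb, weight in neighbours.items():
                for label, coeff in labels[nb].items():
                    scores[label] += weight * coeff
            total = sum(scores.values())
            scores = {label: s / total for label, s in scores.items()}
            # Keep labels whose coefficient reaches 1 / max_labels.
            kept = {l: s for l, s in scores.items() if s >= 1.0 / max_labels}
            if not kept:
                # Assumed tie-break: strongest label, smallest name on ties.
                best = max(sorted(scores), key=scores.get)
                kept = {best: scores[best]}
            norm = sum(kept.values())
            updated[node] = {l: s / norm for l, s in kept.items()}
        if updated == labels:
            break  # labels stopped changing
        labels = updated
    return labels


def topics_from_labels(labels):
    """Group terms by label; a term with several labels joins several topics."""
    topics = defaultdict(set)
    for node, node_labels in labels.items():
        for label in node_labels:
            topics[label].add(node)
    return dict(topics)
```

Each pass touches every edge once per label held by a neighbour, and each term holds at most `max_labels` labels, so the cost per pass grows roughly linearly with the number of edges. Synchronous updates of this kind can oscillate on some graphs, which is one reason the iteration count is capped here.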

Key words: Microblog, Hot Topic Detection, Aging Theory, Hot Term Extraction, Multi-label Propagation

Received: 2013-12-16

ZTFLH (Chinese Library Classification number): TP 391

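Funding: Supported by the National Natural Science Foundation of China (No.61103175), the Key Project of the Fujian Provincial Department of Education (No.JK2012003), the Fujian Provincial Science and Technology Innovation Platform Project (No.2009J1007), and the Natural Science Foundation of Fujian Province (No.2013J01232)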

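About the authors: CHEN Yu-Zhong, male, born in 1979, Ph.D., associate professor. His research interests include computational intelligence, complex networks, and data mining. E-mail: yzchen@fzu.edu.cn. FANG Ming-Yue, female, born in 1989, master's student. Her research interests include complex networks and data mining. GUO Wen-Zhong (corresponding author), male, born in 1979, Ph.D., professor. His research interests include computational intelligence and its applications. E-mail: guowenzhong@fzu.edu.cn.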

Cite this article:

CHEN Yu-Zhong, FANG Ming-Yue, GUO Wen-Zhong. Research on Multi-Label Propagation Clustering Method for Microblog Hot Topic Detection[J]. Pattern Recognition and Artificial Intelligence (模式识别与人工智能), 2015, 28(1): 1-10.

Link to this article:

http://manu46.magtech.com.cn/Jweb_prai/CN/10.16451/j.cnki.issn1003-6059.201501001 or http://manu46.magtech.com.cn/Jweb_prai/CN/Y2015/V28/I1/1